Joint modeling HIV and HPV using a new hybrid agent-based network and compartmental simulation technique

Persons living with human immunodeficiency virus (HIV) have a disproportionately higher burden of human papillomavirus infection (HPV)-related cancers. Causal factors include both behavioral and biological. While pharmaceutical and care support interventions help address biological risk of coinfection, as social conditions are common drivers of behaviors, structural interventions are key part of behavioral interventions. Our objective is to develop a joint HIV-HPV model to evaluate the contribution of each factor, to subsequently inform intervention analyses. While compartmental modeling is sufficient for faster spreading HPV, network modeling is suitable for slower spreading HIV. However, using network modeling for jointly modeling HIV and HPV can generate computational complexities given their vastly varying disease epidemiology and disease burden across sub-population groups. We applied a recently developed mixed agent-based compartmental (MAC) simulation technique, which simulates persons with at least one slower spreading disease and their immediate contacts as agents in a network, and all other persons including those with faster spreading diseases in a compartmental model, with an evolving contact network algorithm maintaining the dynamics between the two models. We simulated HIV and HPV in the U.S. among heterosexual female, heterosexual male, and men who have sex with men (men only and men and women) (MSM), sub-populations that mix but have varying HIV burden, and cervical cancer among women. We conducted numerical analyses to evaluate the contribution of behavioral and biological factors to risk of cervical cancer among women with HIV. The model outputs for HIV, HPV, and cervical cancer compared well with surveillance estimates. Model estimates for relative prevalence of HPV (1.67 times) and relative incidence of cervical cancer (3.6 times), among women with HIV compared to women without, were also similar to that reported in observational studies in the literature. The fraction attributed to biological factors ranged from 22–38% for increased HPV prevalence and 80% for increased cervical cancer incidence, the remaining attributed to behavioral. The attribution of both behavioral and biological factors to increased HPV prevalence and cervical cancer incidence suggest the need for behavioral, structural, and pharmaceutical interventions. Validity of model results related to both individual and joint disease metrics serves as proof-of-concept of the MAC simulation technique. Understanding the contribution of behavioral and biological factors of risk helps inform interventions. Future work can expand the model to simulate sexual and care behaviors as functions of social conditions to jointly evaluate behavioral, structural, and pharmaceutical interventions for HIV and cervical cancer prevention.


A1 OVERVIEW OF MAC
The mixed agent-based compartmental (MAC) simulation framework was developed for comodeling of diseases spread over a common contact network but have varying levels of prevalence and incidence that neither agent-based nor compartmental model alone is sufficient.The computational framework for the hybrid agent-based compartmental simulation was enabled through use of a recently developed agent-based evolving network modeling technique (ABENM) (Eden et al., 2021), applied to the development of the Progression and Transmission of HIV (PATH 4.0) model in the U.S. and validated against data from the National HIV Surveillance Systems (NHSS) (Singh et al., 2021).The main concept of ABENM is to simulate only persons infected with at least one lower prevalence disease and their immediate contacts at the individual level as agents of the simulation, and to model all other persons including those with higher prevalence diseases using a compartmental modeling structure.Immediate contacts are defined as all partners a person will have over their lifetime who at the current time-step may be infected or susceptible.
As these contacts in the network become infected with a lower prevalence disease, their immediate contacts are added as agents to the network (transitioning from the compartmental portion of the model to the network portion of the model), thus evolving the contact network.An Evolving Contact Network Algorithm (ECNA) maintains the network dynamics.Here we provide an overview of the MAC simulation framework, without loss of generality, using a two-disease example, lower prevalence Disease 1 and higher prevalence Disease 2. In the analyses presented in the main paper, we modeled HIV as Disease 1, adopting the validated PATH 4.0 model (Singh et al., 2021) and HPV.We believe this framework can be generalized to any number of diseases, the computational complexity and relevance informing decisions for modeling it in the ABENM or in the compartmental.

A1.1 OVERVIEW OF MAC USING LOWER PREVALENCE DISEASE 1 AND HIGHER PREVALENCE DISEASE 2
We present the framework using HIV as Disease 1 and a hypothetical Disease 2 for simplicity.
More details on modeling HPV can be found in Section 2.2.We track HIV-infected persons and immediate contacts using a dynamic graph   (, ℰ), with the number of nodes in the graph   = |(  )| and the number of edges |ℰ(  )| dynamically changing over time  as persons become newly infected with HIV and their immediate contacts are added to the network.Each person in the network has attributes: age, transmission-group (heterosexual female (HETF), heterosexual male (HETM), men who have sex with men (men only and men and women) (MSM), degree (number of lifetime partnerships), geographic jurisdiction, health status (Disease 1 no Disease 2, Disease 1 Disease 2, no Disease 1 no Disease 2), and HIV-related disease and care continuum stages.
We track all other persons using an array   of size  ×  ×  ×   .
Where,  is the number of age-groups,  is the number of transmission risk groups,  is the number of degree-bins (degree is the number of contacts per person, degrees are grouped into bins analogous to age grouped into age-groups),  is the number of geographic jurisdictions (equal to 1 here corresponding to national), and  is the number of health states (two in numerical analyses: no Disease 1 no Disease 2, and Disease 1 no Disease 2).
Therefore, each element of the array (  [ ̅, ,  ̅ , , ℎ]) is the number of people in that specific category, and thus,   + ∑  , ∈ ̅,, ̅ ,,ℎ , would be the total number of people in the population at time .
As all HIV infected persons and exposed partners are in the network, HIV transmissions and HIV disease progression are modeled at the individual level (using HIV transmission and HIV disease progression modules in Section 0).Disease 2 transmission and progression are modeled using differential equations (as typically done in compartmental modeling technique) but with the consideration that people in both the network   (, ℰ) and compartmental array   can be infected with Disease 2 (see compartmental module in Section 0).
The mathematical challenges to address for this method to work is to maintain the dynamics between the network   (, ℰ) and compartmental array   , including, transitioning people from   to   (, ℰ).Specifically, determining 'who', i.e., the degree, transmission risk group, age, and geographical location of the persons, are to be added as the immediate contacts of the node newly infected with HIV, as these characteristics of infected persons and their contacts are known to be correlated (Liljeros et al., 2001).Upon determining and adding all lifetime partners of nodes newly infected with HIV, we need to determine when the partnerships would initiate and dissolve, including the age of both partners at that time, such that the overall dynamics of age-mixing and risk-group-mixing of the resulting network match that of the U.S. population over time.As network generation techniques in commonly used agent-based models are designed to generate the network between the full population (susceptible and infected persons), they cannot be adopted here.We developed an evolving contact network algorithm (ECNA) for modeling the transitions from   to   (, ℰ) (see ECNA module in Section 00.3).As HIV is a chronic infection, once persons enter the network   (, ℰ) they do not transition back to   .However, modeling such a transition (for cases of lower prevalence disease that are not chronic) would be simple as they can be added to the compartments corresponding to their group.
We discuss the computational structure of MAC in Section 1.2 and the four simulation modules in Section 1.3.A visual representation of the computational structure is presented in Figure 1 in manuscripts.All notations used in the model are also summarized in Table A1.

A1.2 COMPUTATIONAL STRUCTURE OF MAC
As noted above, following the compartmental modeling structure, we use a five-dimensional array is a column of zeros if the node is HIV-infected as all their partnerships are already assigned, and greater than or equal to zero if the node is HIV-susceptible (when the HIVsusceptible person is added as a contact of a different HIV-infected person one of the rows is decremented, and when the HIV-susceptible person becomes HIV-infected all rows of column 2 are decremented to zero as their partners are found and added -the HIV-ECNA was specifically developed for determining when and how to assign these partnerships, and thus generating the network, which is discussed in Section 1.This section presented the computational structure of the model, specifically the compartmental modeling structure, the network structure, and the features of the nodes and edges in the network.The next section describes the methods (modules) used in simulating these features.
We will discuss HPV and cervical cancer model, calibration and validation in Section 2 and Section 0 provides an overview of the steps of the simulation.

A1.3 FOUR MAIN MODULES OF MAC
We present the overall MAC framework through four modules, that are run at every time-step (monthly) of the simulation: a compartmental module for simulating Disease 2, a Bernoulli transmission module for simulating new infections, the ECNA network generation module for generating partnership networks of new HIV-infected persons, and a disease progression module for simulating HIV-related events for HIV-infected persons.

A1.3.1 Compartmental module for simulating higher prevalence disease 2
This module updates the demographic features (births, aging, and deaths) and Disease 2 features (transmission and progression) of persons tracked through the array   , using difference equations as follows.
=  −1 + health state, and thus,   is of size 7 × 3 × 9 × 1 × 41 × 41.Details of the generator matrix specific to HPV is presented in Section 2. For a generalized understanding of this equation, we make a simplified assumption of a single disease stage, i.e., ℎ ̅ = 0 for disease free and ℎ ̅ = 1 for Disease 2 (D2).Thus, we can expand the equation as follows: Each element on the right-hand side can be calculated as follows, where, In this MAC approach, We then randomly choose  2 from node-set {  ̅,, ̅ ,,ℎ=0 } and set their Disease 2 status ℎ ̅ = 1, and randomly choose  2 from node-set {  ̅,, ̅ ,,ℎ=1 } and set their Disease 2 status ℎ ̅ = 0.For the HPV model, such calculations would be conducted for each of the 41 disease stages.

A1.3.2 Transmission module for simulating new disease 1 infections in network (HIVinfection)
This module determines if a HIV-susceptible node  in the network   (, ℰ) becomes infected using a Bernoulli transmission equation and individual-level sexual behaviors and transmission risk factors, each factor modeled as functions of demographic features, and HIV testing and treatment status of infected contacts.Note, as persons in the compartmental model array   are not connected to an HIV-infected person, their chance of infection is zero.However, note, persons can move from   to   (, ℰ) upon becoming partners of an HIV-infected person, modeled using the HIV-ECNA algorithm discussed in the next section, which would then expose them to HIV infection.Specifically, every time-step (monthly) of the simulation, this module determines if a HIV-susceptible node  in graph   (, ℰ) becomes HIV-infected i.e., for nodes with HIV infection status  t−1, = 0 it estimates its updated value  t, as follows.
where,   =   [𝑙, 𝑗]. t,j . , . , , where   [, ],  t,j ,  , are the elements of the graph described in Section 0, and  , is the probability of transmission per act modeled as a function of disease and care stage  , and transmission risk group   of the infected node ; we will have a value of   =  , if  is a contact of  (i.e.,   [, ] = 1), is infected (i.e.,  t,j = 1), and is alive (i.e.,  , = 1), and   = 0 otherwise,  = condom effectiveness,  , = number of sex acts per month with node , modeled as a function of age, transmission risk group, and number of partners of node ,  , = proportion of acts with node  that is condom protected, modeled as a function of age, transmission risk group, and number of partners of node ,  −1 () = an inverse Bernoulli distribution that takes a value of 1 with probability  and value of 0 with probability 1 − .
If node  becomes infected, i.e., if the above equation yields  t, = 1, we set its HIV stage  , = 1.

A1.3.3 ECNA network generation module for generating partnerships of new HIV-infected nodes
This module controls the overall network dynamics of partnerships between nodes.The main functionality is to generate the contact network for each new HIV-infected node .The steps of the ECNA are as follows.
For every new HIV-infected node , 1. Determine the number of new partnerships (edges) to generate as actual degree minus current degree.Note, these new partnerships would all be with HIV-susceptible persons as any partnerships with HIV-infected were already added when networks of those HIV-infected persons were created.
2. For each new HIV-susceptible partner node, determine node features: number of lifetime partners using ECNA (discussed below), risk-group, current age-group, and pseudo-geographic jurisdiction using conditional probability distributions, and partnership distribution using a two-step Markov process (discussed below).
3. For each partnership, determine age of both partners and simulation times at partnership with features matching that in steps 2 and 3 above, can be eligible.

5.
For each new partner and partnership (determined in previous steps) update their corresponding features in the network   (, ℰ) and compartmental array   .

ECNA for determining degree of susceptible partner nodes
For a node , the degree-bin of a partner  ̅  is not independent of its degree-bin  ̅  because of degree correlations between node neighbors, a key feature of scale-free networks (Fotouhi & Rabbat, 2013), i.ePr( ̅  =  ̅  | ̅  =  ̅  ) ≠ Pr( ̅  =  ̅  ) ; where   ̅̅̅̅ is the random variable for degree-bin of node ,and thus,  ̅  cannot be directly drawn from its scale-free network probability mass function.While the literature presents an analytical method for estimation of Pr( ̅  =  ̅  | ̅  =  ̅  ) for general static scale-free networks (Fotouhi & Rabbat, 2013), this method is not suitable for simulating an epidemic in a dynamically evolving contagion network (Eden et al., 2021).Specifically, in general static scale-free networks, the full network is available so the degree of all node neighbors is available, and thus Pr( ̅  =  ̅  | ̅  =  ̅  ) is an expectation over all possible values of  ̅  , i.e., an average over "all" node neighbors.However, as we only simulate HIV-infected nodes and their immediate contacts in the network, in our context, only values corresponding to HIV-infected node neighbors are used.As it is more likely that nodes with higher degree get infected first, the value of Pr( ̅  =  ̅  | ̅  =  ̅  ) when = 'all node neighbors' is different compared to when = 'HIV-infected node neighbors'.And further, Pr( ̅  =  ̅  | ̅  =  ̅  ) is likely to change over time as the HIV epidemic spreads and the percent of the population that is HIV-infected changes.We developed a neural network model for the prediction of Pr( ̅  =  ̅  | ̅  =  ̅  ) using as independent variables,  ̅  ,  ̅  , the minimum degree of the network m, the percent of the population that is infected, and the scale-free network parameter    corresponding to transmission risk group of node  (  ).Details of this algorithm are presented in (Eden et al., 2021).

Two-step Markov process for determining the partnership distribution matrix for each new partner
Suppose   is the actual degree (number of lifetime partners) of a newly added susceptible node  (Singh et al., 2021).

A1.3.4 Disease progression module for simulating disease 1 progression for nodes in the network
The disease progression module updates the individual-level demographic and disease dynamics for every HIV-infected person in the network just like an agent-based model.This includes aging, Disease 1-related and natural mortality, progression through Disease 1 disease and care stages.For the analyses in the main paper, we adopted the disease progression module from PATH 2.0 (Gopalappa et al., 2016).

A2.1 HPV NATURAL HISTORY MODEL
The natural history of HPV and cervical cancer incidence and progression is presented in Figure A1 and was adapted from models in the literature (Brisson, 2019;Gopalappa et al., 2018;Lee & Tameru, 2012;Xia et al., 2019).For women, once a person is infected with HPV, the infection can persist, clear naturally by developing type-specific immunity or progress to cervical intraepithelial neoplasia grades I, II and III (CIN1, CIN2, CIN3); CIN 3 can then progress to invasive cervical cancer, progressing sequentially to localized, regional, and distant cancer stage.
We only modeled cervical cancer that initiate as HPV infection.Persons diagnosed with cervical cancer, detected through either screening or natural symptoms, move to a diagnosed stage and face a certain chance of stage-specific mortality (the label in the figure corresponds to the stage at diagnoses).We stratified HPV types into three groups: HPV 16/18 pooled, HPV-31/33/45/52/58 pooled (high risk HPV types included in HPV 9 vaccine); other high-risk HPV pooled, thus we only modeled HPV-induced cervical carcinogenesis associated with all high-risk HPV types in this work.We did not model low-risk HPV types as their prevalence and risk of progression to cancer is low.We assume each HPV genotype group is independent of others, i.e., one cannot infect with multiple genotypes at the same time.We also assume infection-acquired immunity will wane over time while vaccine-acquired immunity stays life-long, and vaccine has 100% efficacy against corresponding HPV genotypes.
For MSM, we simulated HPV and anal cancer using the same epidemiology as HPV and cervical cancer (Darragh & Winkler, 2011).For heterosexual males, we only modeled genotype-specific HPV infection and natural immunity (Figure A2).We first present the mathematical formulation of this natural history model, followed by data assumptions for incidence and progression.

A2.2 OVERVIEW OF COMPARTMENTAL MODEL STRUCTURE FOR HPV
As noted in Section 1, the general equation for the compartmental model can be written as, The 41 health states of the compartmental model include a disease-free state (ℎ ̅ =0), and 14 disease states, ℎ ̅ = s τ ∈ {1, … ,14} for each of the three HPV genotypes τ ∈ {1,2,3}, such that ℎ ̅ = 1 τ is the first stage of the disease for HPV genotype τ.Thus, we can represent the force of infection as  , (0, 1 τ ) =  , (0, 1  ), which we calculate using the method in Section 2.3.All other state transitions are as in a general Markov chain (see Section 1.3.1 for the simple two disease stage transition).

A2.3 HPV TRANSMISSION
We simulated HPV transmission as a function of partnership acquisition and dissolution, by transmission risk group, age group, and degree bin.HPV infection can be transmitted, depending on the number of partner change per time unit, type-specific HPV prevalence among the group of partners, probabilities of HPV transmission per sex act, number of sex acts per time unit and number of partners per time unit as a proxy to duration of partnership.We present the mathematical formulation below.

A2.3.1 Force of infection
The overall force of infection  , (0, 1  ) for persons in risk-group  at time  for HPV type , is calculated using

A2.3.2 Transmission probability per partnership
As is common, the transmission probability per partnership is calculated by using a Bernoulli equation.We assume it is time-invariant

A2.3.3 Partnership mixing pattern
We calculate   ′ , the mixing matrix for a person in risk-group  with persons in risk-group ′ per time unit using time-invariant mixing matrices as follows where,

A2.4.1 Natural history parameters
The natural history model for stage transitions was adopted from (Gopalappa et al., 2018).
Progression and regression rates for the stage transitions were adopted from (Gopalappa et al., 2018;Johnson et al., 2012;Tan et al., 2018).Progression rates in cervical cancer stages, mortality rates by cancer stage, and probability of diagnoses by cancer stage can vary based on access to care and treatment (e.g., persons seeking care upon show of symptoms), so we took data specific to US from SEER database which is presented in (Brisson, 2019).Due to lack of data for anal cancer specific stage transition rates among general population, we assumed same rates as for cervical cancer.

A2.4.2 Sexual behavior parameters
To be consistent with the sexual behavior in the network model, we calculated or took the following parameters from the PATH 4.0 which were originally from (Broz et al., 2014;Center for Disease Control and Prevention, 2010;Reece et al., 2010;Singh et al., 2021): Partnership change rate: The partnership change rate was calculated using the following equation:

𝑝 𝑑,𝑟 (𝑎) × |𝑑| |𝑎|
, () represents the probability of partnership initiation at age group  for people in degree bin  and sexual transmission risk group , which was calculated using the estimated proportion of partnerships initiated by age-group for persons with lifetime-partners in degree bin.|| is the number of lifetime partners in degree bin  and we took an average of the degree bin range.For example, if  represents degree bin 5  8, then || = 5+8 2 = 6.5.|| is the time interval representing the width of the age-group.

Number of annual partners:
We used simulation techniques to estimate expected number of annual partners.We made following assumptions to ensure the output would be consistent with sexual behavior in the network model: -An existing partnership without concurrency will end when the next partnership initiates or until one reaches the last age.
-An existing partnership with concurrency will end  unit of time after the next partnership initiates, where  follows an exponential distribution.

Mixing matrix:
The age-group mixing matrix was calculated by observed data that was extracted from the report (Glick et al., 2012).We assume sexual contact network follows scalefree network structures, thus given one's degree, their neighbor's degree can be calculated by using the degree correlation equation derived by (Fotouhi & Rabbat, 2013).The degree bin mixing matrix is then calculated by combining each individual degree together to logarithm-2 bin.The transmission risk group mixing matrix was obtained by assuming heterosexual males only have sex with heterosexual females and 20% of partners is heterosexual females among 29% of MSM who are bi-sexual (Finlayson et al., 2011;Friedman et al., 2014;Rosenberg et al., 2011;Voetsch et al., 2012).
Other parameters: All other parameters were either from the literature directly or from estimation and assumption presented in (Singh et al., 2021).The data includes proportion of condom-protected acts for main and casual partnership, number of sex acts per year, condom effectiveness.We assume condom effectiveness for HPV is the same as that for HIV.We also assume in the compartment model, proportion of condom-protected acts for main partnership is applied to the group where annual partnership is less than or equal to one, otherwise proportion of condom-protected acts for casual partnership is applied.

A2.5 CALIBRATION AND VALIDATION
We calibrated the transmission probability per sex act to match epidemiological data targets used in the CISNET study (Burger, de Kok, et al., 2020), specifically, HPV prevalence and cervical cancer incidence and mortality from the era prior to implementation of HPV and cervical cancer screening.To do so we simulated the model to steady state by doing a dry-run for 70 years at 6month time steps and repeating this with different probabilities of transmission until we found the set of values that gave a good fit to the following data metrics.
• Pre-screening age-specific HPV prevalence among normal cytology from New Mexico HPV Pap Registry study (Joste et al., 2015).
• Pre-screening age-specific HPV genotype frequency among normal cytology obtained from New Mexico HPV Pap Registry study (Joste et al., 2015).
To validate the model, we compared our model outcome with cervical cancer incidence and mortality from the Connecticut Tumor Registry (CTR) for the period 1950-1969 (Laskey et al., 1976) (See Figure 3a in the main manuscript).We also used the above calibrated model and screening rates from year 2006 in the U.S., to simulate HPV prevalence and cervical cancer incidence to steady state by doing a dry-run for 30 years at 6-month time steps.Specifically, we adopted age-specific cervical cancer screening utilization rate in 2006 (Burger, Smith, et al., 2020), and screening sensitivity, proportions receiving follow-up colposcopy/biopsy and proportion receiving precancer treatment for those in pre-cancer stages, and additionally cancerscreening sensitivity for those in cancer stages as per (Burger, Smith, et al., 2020;Kim et al., 2015).Our model output is consistent with cervical cancer incidence and mortality reported more recently (National Cancer Institute, 2016) (See Figure 3b in the main manuscript).

A3 MODEL INITIALIZATION AND DRY RUN
We can initialize the model to be representative of people in a population in a specific year.For the analyses in the main simulation, we initialized age, risk-group, degree distributions as per persons in the U.S in 2006 in both compartmental model and network.For distribution by Disease 2 health status in the compartmental model and network, we randomly selected 30% of persons in each category as infected with Disease 2, and dry running the simulations so that the state distributions reach a steady state.We initialized the network for Disease 1 to be representative of HIV in the US in the year 2006 through using two dry runs to ensure network dynamics are generated in addition to epidemic and demographic distributions.
Dry run is a technique used for initialization of the model.It involves running the simulation for a certain period of time, but the data generated over that period of run is not representative of an actual epidemic projection and thus referred to as a dry run.For a pure agent-based model (without networks), we could initialize the model with a few agents and assign parameters to match surveillance data for the demographic, disease stage, and care-continuum stage distributions.But the agents would be lacking the 'history', e.g., the age at infection, and the age and stage of ART initiation, relevant for modeling future events.Therefore, we can do a dry run, which means starting with some number of people, assigning them data according to a specific year, say 2006 surveillance distributions as done here for HIV, running the simulation for several years while maintaining the distributions to match that of 2006.As persons become newly infected, their history is being generated, and the initial persons from day 0 age-out of the model.The above method is also sufficient for diseases modeled in the compartmental model (Disease 2 here).
However, for diseases modeled in ABENM (Disease 1 here), the network adds another layer of complexity as static data is infeasible (and moreover, rarely available) for determining the contact network structure, including the links of HIV-infected persons to each of their lifetime partners, the current age of partners, the initiation and termination times of the links, the initiation and termination age of nodes at both ends of a link, the risk-group of the partner, and the infection status of partner.Therefore, for Disease 1, we first do a dry run to allow for the network to dynamically grow over time (Dry run #1), and then apply the data from surveillance to set the demographic, and disease and care-continuum stages (Dry run #2).Dry runs are explained in more detail in (Singh et al., 2021) as applied to HIV and validated against NHSS, a similar method can be applied for other diseases modeled.

A4 OVERVIEW OF SIMULATION MODELING STEPS
We provide an overview of the full simulation model in this section.All notations used in the model are also summarized in Table A1.
Step 1: Initialization of the compartmental model array  =0 and graph  =0 (, ℰ) at time-step  = 0 Step 1a.We initialize the compartmental model array to be representative of the U.S.
population by age, transmission risk group, and degree distribution in 2006.We distributed the total population in the U.S. into degree bins by using (2 , assuming minimum degree  = 2 1 and maximum degree  ̅ = 2 7 , and estimating the power-law parameter λ from life-time partner distribution data for MSM and heterosexuals (Glick et al., 2012).Within each degree-bin, we distributed the population into seven age-groups, ranging from age 13 to 65, using US census data for distribution by age, and further by risk-group (heterosexual male, heterosexual female, or MSM) using population size estimates for MSM from (Centers for Disease Control andPrevention, 2012, 2015).
Step 1b.We initiate a random graph of 1500 nodes and zero edges.For each node, we set their HIV status as newly infected, allocate a degree by randomly drawing from the overall degree distribution, and assign Disease 1-related care continuum and disease stages, age, and transmission risk group through random draws from the corresponding distributions of HIV in the US in 2006.
Step 2: Dry run to make the network and disease state-distribution representative of the population of interest Dry run the model for a certain number of time-steps (here we chose 200 monthly time-steps) by running the following steps sequentially for every time-step.
Step 2a.Run the ECNA module (Section 0) to generate the network of contacts for every new HIV-infected node in the network.
Step 2b.Run the compartmental module (Section 0) to update Disease 2 parameters in the compartmental model and network.
Step 2c.Run the transmission module (Section 0) to generate new Disease 1 infections in the network.
Step 2d.Run the disease progression module (Section 0) to update demographics and Disease 1 progression and care parameters for nodes in the network.
Repeat Steps 2a to 2d, for the required number of months.For the analyses in the paper, we

𝐴
The number of age-groups.

𝑅
The number of risk-groups.

𝐷
The number of degree bins.

𝒢
The number of pseudo-geographic jurisdictions.

𝐻
The number of health states related to Disease 2.

𝑅[𝑎 ̅, 𝑘]
A binary matrix of size  × (  −  ̂t,l ), which represents whether the partnership with partner  would occur when partner  is in age-group  ̅.

𝑛[𝑖]
A vector of size A, denoting the number of partnerships that should initiate when the partner is in age-group i.

𝑒 𝑘 [𝑖]
A vector of size A, representing node k to be eligible to initiate a partnership at age-group i.
, ̅ ,,ℎ=1 }| i.e.,   would be the sum of Disease 2 infected persons in group  (  [ ̅, ,  ̅ , , ℎ = 1]) in the compartmental model and the number of nodes in the network in group  (|{  ̅,, ̅ ,,ℎ=1 }|), and similarly,   would be the sum of persons in group  in the compartmental model and number of nodes in group  in the network with Disease 2 (ℎ ̅ = 1) and without Disease 2 (ℎ ̅ = 0).Simulating Disease 2 among HIV infected persons in the network   (, ℰ) Continuing with the simple singe stage Disease 2 demonstration example, we can use the above transition rates (infection rate   ̅,̅ , ̅ , and progression rate ) to simulate Disease 2 among nodes in the network.We first convert rates to probability, as  = 1 −  −.1 (assuming sojourn times follow exponential distribution).We then use a binomial distribution to determine number of nodes to transition, specifically, number of nodes newly infected with Disease 2 = 2 ~Binomial( |{  ̅,, ̅ ,,ℎ=0 }|, 1 −  −  ̅,, ̅ , .1 ) number of nodes newly recovered from Disease 2 = 2 ~Binomial( |{  ̅,, ̅ ,,ℎ=1 }|, 1 −  − .1 ) initiation and termination, by formulating and solving as assignment optimization model (discussed below).4. Determine who each new partner is by a uniform random draw from all who are eligible, i.e., all persons who are eligible have an equal chance of selection.All HIV-susceptible nodes in the network   (, ℰ) and HIV-susceptible non-agents in the compartmental model array simulated years 2006 to 2017 in monthly time-steps for Disease 1-and 6-month time-steps for Disease 2.

Figure
Figure A1 HPV and cancer disease diagram for HETF and MSM.

Figure
Figure A2 HPV natural history diagram for HETM.

Disease 2-infection status: ℎ
̅ , = 1 if node  is Disease 2-infected and ℎ ̅ , = 0 otherwise Deceased status:  , = 1 if node  is alive and 0 otherwise, Age:  , taking an integer value representative of the age of node ,   , at any , i.e., as  ,[, 1]tracks number of partnerships that initiate at age-group , when summed over all  it should add to the actual degree   for all nodes = is the per contact transmission probability,   is the number of contacts,   is partnership mixing of group  with group  (here group represents a combination of  ̅, ̅ ,  ̅ , ),   is the number of persons infected with Disease 2 in group , and   is the number of persons in group .
is the force of infection for persons in age-group  ̅, transmission risk group ̅ , degree-bin  ̅ , pseudo-geographic jurisdiction  at time ,  is the recovery rate (in these hypothetical analyses we assumed it is static, but could be varied as a function of  ̅, , , to represent epidemiological differences by demography or as a function of time  to represent changes in care or treatment over time, | ̅| is the age-interval of age-group  ̅, and   ̅ is mortality rate as a function of age-group  ̅ (here we only modeled all-cause mortality, but it could be varied as a function of health status ℎ ̅ ).For this simple single Disease 2 demonstration example, we can write force of infection   ̅,, ̅ , as follows, (for clarity, we replace   ̅,, ̅ , =  in subscript to denote the group of reference)  =     ∑ .We need to determine at what age of  will each partnership initiate.Specifically, suppose there is a matrix  ̅ of size  × , with  the number of age-groups and  the number of degree-bins, partnerships that initiate at age  ̅, as   [ ̅, 1] =  ̅ [ ̅,  ̅ ].   .Direct data for  ̅ would be a longitudinal survey over the duration of life of an individual, where the individual reports the number of partnerships they initiated at every age point of their life.Such surveys, however, are unavailable.Therefore, we estimated  ̅ using survey data on the reported number of partners up to that time by persons of different age-groups.Note that, these survey data only represent the number of partners up to the current age of the surveyed individual.Thus, the degree-bin  ̅ each person would belong to is unknown as  ̅ represents the number of partners the person would eventually have over their full lifetime.The age at which each partnership initiated is also unknown.As data for  ̅ are not directly available, we developed a two-step estimation process.In step 1, we solved for the probabilities of initiating a new partnership when a person ages from age-group  ̅ − 1 to  ̅, by formulating it as the transition probability matrix of a Markov chain and solving for it using an optimization model, calibrating to the survey data.In step 2, we used the transition We formulated the problem of assigning age at partnership initiation as a variant of an assignment optimization model.Suppose  []is the age of partnership initiation for  ℎ partner of node  (|  | =   the degree of node ).Then the objective is to assign each element (and all elements) of   of a newly infected node  to one (and-only-one) element of   for each partner , with constraints to maintain the probability distribution for age-mixing between partners, and partnership distribution matrices ( , and  , , ∀) of the newly infected node  and each partner  .However, solving it using standard solvers for each infected person is computationally expensive, and therefore, we developed a heuristic solution algorithm.Solving for partnership initiation age for each new partner sets the focal point for assigning the other three parameters partnership initiation time, partnership termination age, and partnership termination time.Details of this method are presented in (Singh et al., 2021) ̅ ] representing the proportion of partnerships that initiate at age-group  ̅ for persons in degree-bin  ̅ , with each column of  ̅ adding to 1, for all  ̅ ∈ {1,2, … }.Then, for any node  with actual degree    ̅ , we can calculate the partnership distribution matrix, i.e., the number of probability matrix from step 1 to simulate partnership changes over the lifetime of a person, starting from age-group  ̅ = 1 to age − group  ̅ =  , repeating for 10,000 people, grouping together all persons who end in the same degree-bin  ̅ at the last age-group , and for each degreebin  ̅ , calculating the average proportion of partnerships that initiated in age-group  ̅, ∀ ̅.Details of this method are presented in(Singh et al., 2021).
is the rate of change in   , modeled as Markov chain state transitions through non-stationary generator array   .The compartmental model array   for HPV is of size 7 × 3 × 9 × 1 × 41, corresponding to size of age-group, risk-group, degree-bin, geographical jurisdiction, and HPV health state, and thus,   is of size 7 × 3 × 9 × 1 × 41 × 41.To simplify the notation, we define  = [ ̅, ̅ ,  ̅ , ], and rewrite the above equation as  , =  −1, +  =  −1,  , such that  , is a 41 × 41 matrix, representing the Markov chain generator matrix for modeling transitions related to age-group  ̅, risk-group ̅ , degree-bin  ̅ , and geographical jurisdiction .Note, here we only have one jurisdiction, as it is a national model.
, Δ  [∑  ,′  , ′ ,  , ′ , +  ,′ ] ′where, ̅ , is e transmission probability per partnership for a person between HPV positive person with HPV type  and HPV negative person in transmission risk group v, ∆  is the expected contact rate change for persons in transmission risk group v,  ′ is the mixing matrix for a person in risk-group  with persons in risk-group ′ per time unit,  ,′, is the number of people who are infected with HPV-type  infection in transmission risk group ′at time step ,  , ′ is the number of people who are HPV negative in transmission risk group ′ at time step .
, is the transmission probability per sex act for a person between HPV positive person with HPV type  and HPV negative person in transmission risk group v,   is the average number of exposure or average number of sexual acts per partnership for persons in transmission risk group  per unit of time,   is the number of sexual acts per time unit for persons in transmission risk group ,   is the number of sexual partners per time unit for persons in transmission risk group ,  is the proportion of condom protected acts per time unit for persons in transmission risk group , ε is condom effectiveness, defined as the probability of reducing HPV transmission.
is the jurisdiction mixing matrix, element   (,  ′ ) is the probability a person in geographical location  mixes with persons in geographical location  ′ for transmission- is the aging mixing matrix, element   (,  ′ ) is the probability a person in age-group  mixes with persons in age-group  ′ for transmission-group  ∈ {HETF, HETM, MSM}.  is the degree bin mixing matrix, element φ r (,  ′ ) is the probability a person in degree bin  form partnership with persons in degree bin ′ for transmission-group .
(Singh et al., 2021)ompartmental model to represent age-group distribution and risk-group distribution of people in the United States in 2006.We split the population by their HIV-state, age group, degree bin, sexual transmission risk group and geographical location.We initialized the number of people in each age and transmission risk group by distributing individuals to match with the US census data presented in(Singh et al., 2021).The age group distribution does not sum to 1, since we only consider age from 13 to 100 in our model.We assume that 49.03% are heterosexual female, 48.70% as heterosexual male, and 2.27% of the US population are MSM.We modeled births in such way that individuals enter the compartment model by going into the youngest age group where people become sexually active.We assume people start sexually active as early as the age of 13 and hence there is no transmission for HIV and HPV prior to the youngest age in the model.Individuals are susceptible to both HPV and HIV upon their entering the population.We assumed a constant rate of entering the population α (overall birth rate in the US) and we used this rate multiplied by the number of individuals in the lowest age group (age 13 to17) of corresponding transmission risk group, degree bin and geographical location.Individuals leave the compartment model in two different ways: 1) Deaths and 2) aging out of the last age (age 100).Deaths in the compartmental model can result from: 1) cervical cancer 2) all-cause mortality (exclude HIV/AIDS and cervical cancer).We assume the oldest age for HIV and HPV transmission through sexual activity is 65 years, since the age of less than 2% of newly diagnosed HIV-infected individuals were 65 years and above.Mortality that contributes from cervical cancer varies by cancer stage, age and diagnosed status.
Initialization: Births: Aging: Individuals move on to the next age groups of corresponding transmission risk group, degree bin, geographical location and disease state from their current age group at a constant rate equal to inverse of the interval in age group a.Except for the youngest and oldest age group, all compartments experience inflows from the prior age group and outflows into the next age group.The youngest age group (age 13 to 17) receives inflow through births.The transitioning out of the last age group (age 65 to 100) results in their exiting the population.

Table A1 Table of Notations Notation Description
̅; ;  ̅ ; ; ℎ Used when referring to an age-group, risk-group, degree-bin, pseudogeographic jurisdiction and health states, respectively.[,]Astatic adjacency matrix of size   ×   , with static element  [, ]= 1 if  and  are sexual partners anytime during their lifetime and   [, ] =0 A dynamic adjacency matrix of size   ×   , with element   [, ] = 1 if  and  are sexual partners during month  and   [, ] = 0 otherwise.ℯ = {, } An edge in graph   representing a sexual partnership between   and  ({, }) The partnership initiation time; represents the simulation month for partnership initiation.({, }) The partnership termination time; represents the simulation month for partnership termination.{ ̅  ,  ̅  } The age of nodes  and  at the time of their partnership initiation.{  ,   } The age of nodes  and  at the time of their partnership termination. ̅ , Age-group of node  at time . , Age of node j at time t.Degree-bin corresponding to the number of lifetime partners of node .  The actual number of lifetime sexual partners of node . ̂t,j The number of lifetime sexual partners of person  who are already added as nodes in graph G at time t.For infected nodes   =  ̂t,j for susceptible nodes in G,   ≥  ̂t,j  , A partnership distribution matrix of size  × 2, where  , [ ̅, 1] is the number of partnerships that initiate at age-group  ̅, and  , [ ̅, 2] is the number of partnerships that are yet to be assigned.For infected nodes,  , [ ̅, 2] = 0, ∀ ̅  t,j Infection status of node j at time t. , Deceased status of node j at time t. , Care continuum or disease stage of person j at time t. , Infectiousness or risk of transmission per act for person j at time t. , The number of sex acts per month for person j at time t. , The proportion of acts condom protected of person j at time t. −1 () The inverse Bernoulli distribution that takes values 1 with probability  and 0 with probability 1 − .  Random variable for degree of node .Pr (  =   |  =   ) Conditional probability distribution for   .Pr (  =   ) Marginal probability distribution for   .m Minimum degree of the network.   Scale-free network parameter corresponding to the risk-group of node l.  ̅ [ ̅,  ̅ ] A matrix of size  × , representing the proportion of partnerships that initiate at age-group  ̅ for persons in degree-bin  ̅ .The transition probability matrix of size  ×  for age-group  ̅.   ̅ The steady state distribution for lifetime-partners in age-group  ̅.  A vector of size  2 × 1, converted from matrix   ̅ . A matrix of size  ×  2 , recreated with values of   ̅ . A binary matrix of size  ×  2 . A vector of ones of size  × 1. [, ] An age-mixing matrix of size  × , which represents the probability that, given a person is in age-group , his or her partner is in age-group .Varies by risk-group.[, ] A matrix of size  × , representing the expected number of contacts the newly infected node  should have with person in age-group , when node  is in age-group .
, ;  ;   ; ℊ  ;  t,j Used when referring to the age, transmission-group, degree, geographic jurisdiction, and health status, respectively, of node  in in the network at time t.Notations with no t in subscript are variables whose values do no change over time.[̅,,̅ , , ℎ] An array of size  ×  ×  ×  ×  representing the number of susceptible persons in the model, in age-group  ̅, transmission risk group , degree-bin  ̅ , pseudo-geographic jurisdiction , and health state ℎ at time t.(, ℰ)A dynamic graph with  a set of nodes and ℰ a set of edges, at time t.The number of nodes in graph G, at time t.